Specificity of the number of nouns in Czech and its annotation in Prague Dependency Treebank
Abstract
The paper describes how the grammatical category of number of nouns will be annotated in the forthcoming version of the Prague Dependency Treebank (PDT 3.0), concentrating on peculiarities beyond the regular opposition of singular and plural. A new semantic feature closely related to the category of number, the so-called pair/group meaning, was introduced. With their plural forms, nouns such as ruce ‘hands’ or klíče ‘keys’ refer to a pair or to a typical group even more often than to a larger number of single entities. Since most Czech concrete nouns can refer to pairs or groups, the pair/group meaning is considered a grammaticalized meaning of nouns in Czech. The present paper describes the manual annotation of the pair/group meaning, which was carried out on the data of the Prague Dependency Treebank. A comparison with a sample annotation of data from the Prague Dependency Treebank of Spoken Czech demonstrated that the pair/group meaning is both more frequent and more easily distinguishable in the spoken than in the written data.
Similar resources
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Grammatical number of nouns in Czech: linguistic theory and treebank annotation
The paper deals with the grammatical category of number in Czech. The basic semantic opposition of singularity and plurality is proposed to be enriched with a (recently introduced) distinction between a simple quantitative meaning and a pair/group meaning. After presenting the current representation of the category of number in the multi-layered annotation scenario of the Prague Dependency Tree...
Prague Czech-English Dependency Treebank: Any Hopes For A Common Annotation Scheme?
The Prague Czech-English Dependency Treebank (PCEDT) is a new syntactically annotated Czech-English parallel resource. The Penn Treebank has been translated to Czech, and its annotation automatically transformed into a dependency annotation scheme. The dependency annotation of Czech is done from plain text by automatic procedures. A small subset of corresponding Czech and English sentences has be...
Coreference in Prague Czech-English Dependency Treebank
We present coreference annotation on parallel Czech-English texts of the Prague Czech-English Dependency Treebank (PCEDT). The paper describes innovations made to PCEDT 2.0 concerning coreference, as well as the coreference information already present there. We characterize the coreference annotation scheme, give the statistics and compare our annotation with the coreference annotation in Onton...
(Pre-)Annotation of Topic-Focus Articulation in Prague Czech-English Dependency Treebank
The objective of the present contribution is to give a survey of the annotation of information structure in the Czech part of the Prague Czech-English Dependency Treebank. We report on this first step in the process of building a parallel annotation of information structure in this corpus, and elaborate on the automatic pre-annotation procedure for the Czech part. The results of the pre-annotat...
Journal title:
- Prague Bull. Math. Linguistics
Volume 96
Publication date: 2011